Abstract In spite of the recent improvements in the performance of the solvers
based on the DPLL procedure, it is still possible for the search algorithm to fo- 
cus on the wrong areas of the search space, preventing the solver from returning 
a solution in an acceptable amount of time. This prospect is a real concern e.g. 
in an industrial setting, where users typically expect consistent performance. To 
overcome this problem, we propose a framework that allows learning and using 
domain-specific heuristics in solvers based on the DPLL procedure. The learn- 
ing is done off-line, on representative instances from the target domain, and the 
learned heuristics are then used for choice-point selection. In this paper we focus 
on Answer Set Programming (ASP) solvers. In our experiments, the introduc- 
tion of domain-specific heuristics improved performance on hard instances by up 
to 3 orders of magnitude (and 2 on average), nearly completely eliminating the 
cases in which the solver had to be terminated because the wait for an answer had 
become unacceptable. 



1 Introduction 

In recent years, solvers based on the DPLL procedure [1,2] have become amazingly fast.
Mostly, that is due to good heuristics that direct the search toward the most promising 
areas of the search space, and to learning algorithms that discover features of the search 
space on-the-fly. Unfortunately, when the search space is sufficiently large, it is still 
possible for the search algorithm to mistakenly focus on areas of the search space that 
contain no solutions or very few. When that happens, performance degrades substan- 
tially, even to the point that the solver may need to be terminated before returning an 
answer. This prospect is a real concern when one is considering using such a solver
in an industrial application, in which the solver will act as part of a black-box from 
which users typically expect consistent performance. It should be noted that the phe- 
nomenon of performance degradation is often due to the fact that the heuristics used in 
choice-point selection are general-purpose, and thus may not fit a given domain well.

Various methods have been proposed in the literature to improve solver stability. A 
basic technique involves using parametrized general-purpose heuristics, for which the 
user can manually specify parameter values that are suitable for the domain of interest.
An interesting step in that direction is provided by CLASPFOLIO [3], which makes
use of different configurations of the CLASP solver [4]. The input program is
automatically analyzed, and the most promising configuration of CLASP is selected
accordingly. A different approach consists in having the solver adapt to the input
problem at run-time by means of learning. This is the case of the clause learning and
conflict learning techniques that have recently become very popular, especially in SAT
and ASP solvers (see e.g. [5,4]), and have brought about substantial performance
improvements. The idea behind these learning techniques is to record information about
the conflicts that are detected during the exploration of the search space, and to use
the information to avoid descending similar branches later. Hence, the basic heuristic
is still general-purpose, but it is adjusted, during execution, depending on the
features of the input problem. One drawback of this approach is that the learning is
limited to the current program, and the information that has been learned in one run
cannot be used in later runs of the solver.

In this paper we propose a framework, called DORS, which instead allows learning
and using domain-specific heuristics in solvers based on the DPLL procedure. The
results of learning are retained and can be used in later runs. In fact, the learning
technique relies on the availability of information from multiple runs of the solver.
Although here we focus on solvers for Answer Set Programming (ASP) [6,7], the DORS
framework can be applied to any solver based on the DPLL procedure, including SAT and
constraint solvers. Furthermore, the current framework is aimed at improving the
efficiency of the computation of one model of consistent programs, but could be extended
further. The learning is done off-line, on representative instances from the target
domain. The particular learning technique used here is intentionally very simple, but
already yields remarkable performance improvements. In our experimental evaluation, the
use of our technique improved performance on hard instances by up to 3 orders of
magnitude (and 2 on average on industrial problems), nearly completely eliminating the
situations in which the solver had to be terminated because the wait for an answer had
become unacceptable.

This paper is organized as follows. In the next section we give some background
on ASP. Next, we discuss the basic search algorithm used in most ASP solvers. Then,
in Section 4 we present the DORS framework. Experimental results are discussed in
Section 5. Finally, in Section 6 we draw conclusions.

2 Answer Set Programming 

We define the syntax of the language precisely, but only give its informal semantics
in order to save space. We refer the reader to [6,8] for a specification of the formal
semantics. Let $\Sigma$ be a signature containing constant, function and predicate symbols.
Terms and atoms are formed as usual in first-order logic. A (basic) literal is either an
atom $a$ or its strong (also called classical or epistemic) negation $\neg a$. The set of
literals formed from $\Sigma$ is denoted by $lit(\Sigma)$. A rule is a statement of the form:

$$h_1 \lor \ldots \lor h_k \leftarrow l_1, \ldots, l_m, \mathit{not}\ l_{m+1}, \ldots, \mathit{not}\ l_n$$

where $h_i$'s and $l_i$'s are ground literals and not is the so-called default negation. The
intuitive meaning of the rule is that a reasoner who believes $\{l_1, \ldots, l_m\}$ and has no
reason to believe $\{l_{m+1}, \ldots, l_n\}$, has to believe one of $h_i$'s. The part of the
statement to the left of $\leftarrow$ is called head; the part to its right is called body.
Symbol $\leftarrow$ can be omitted if no $l_i$'s are specified. Often, rules of the form
$h \leftarrow \mathit{not}\ h, l_1, \ldots, \mathit{not}\ l_n$ are abbreviated into
$\leftarrow l_1, \ldots, \mathit{not}\ l_n$, and called constraints. The intuitive meaning of a
constraint is that its body must not be satisfied. A rule containing variables is interpreted
as the shorthand for the set of rules obtained by replacing the variables with all the
possible ground terms (called grounding of the rule). A program is a pair
$\langle \Sigma, \Pi \rangle$, where $\Sigma$ is a signature and $\Pi$ is a set of rules over
$\Sigma$. We often denote programs just by the second element of the pair, and let the
signature be defined implicitly. In that case, the signature of $\Pi$ is denoted by
$\Sigma(\Pi)$. Finally, an answer set (or model) of a program $\Pi$ is a collection of its
consequences under the answer set semantics. Because a convenient representation of
alternatives is often important in the formalization of knowledge, the language of ASP has
been extended with constraint literals [8], which are expressions of the form
$m\{l_1, l_2, \ldots, l_k\}n$, where $m$ and $n$ are arithmetic expressions and $l_i$'s are
basic literals as defined above. A constraint literal is satisfied whenever the number of
literals that hold from $\{l_1, \ldots, l_k\}$ is between $m$ and $n$, inclusive. Using
constraint literals, the choice between $p$ and $q$, under some set of conditions $\Gamma$,
can be compactly encoded by the rule $1\{p, q\}1 \leftarrow \Gamma$. A rule of this form is
called a choice rule. When solving sets of problems from a given domain of interest, ASP
programs are often divided into a domain description and a problem instance. Intuitively,
the domain description encodes a description of the problem domain and of the solutions,
while each problem instance encodes a different problem from the domain.

3 Search in ASP Solvers 

The search algorithm used by many ASP solvers (e.g. SMODELS [9], DLV [10]) is based
on the DPLL procedure [1,2]. The basic algorithm for the computation of a single
answer set, which we will later refer to as the standard algorithm, is shown in Figure 1. The



function solve(Π : Program, A : Set of Extended Literals)
    B := expand(Π, A);
    if (B is answer set of Π) then return B;
    if (B is not consistent or B is complete) then return ⊥;
    e := choose_literal(Π, B);
    B' := solve(Π, B ∪ {e});
    if (B' = ⊥) then B' := solve(Π, B ∪ {not(e)});
    return B';



Figure 1. Basic Search Algorithm for ASP 



algorithm is based on the idea of growing a particular set of (ground) literals, often
called partial answer set, until it is either shown to be an answer set of the program,
or it becomes inconsistent. To achieve this, guesses have to be made as to which
literals may be in the answer set. Let us now describe the algorithm more precisely. By
extended literal we mean a literal $l$ or the expression $\mathit{not}\ l$, intuitively meaning
that $l$ is known not to hold in the answer set (but its complement, $\bar{l}$, may or may not
hold). Given an extended literal $e$, $not(e)$ denotes the expression $\mathit{not}\ l$ if
$e = l$ and it denotes $l$ if $e = \mathit{not}\ l$. Algorithm solve takes as input a program,
$\Pi$, and a partial answer set, $A$, which is a set of extended literals. $A$ is initially
empty. Function expand [9] is then used to add to the partial answer set all the literals
that must hold given $\Pi$ and $A$. If the result of expand is an answer set of $\Pi$, the
algorithm returns it (and terminates). If instead a contradiction is discovered, then the
algorithm returns no model ($\bot$). In all other cases, the partial answer set is still
incomplete but consistent. Then, function choose_literal selects an extended literal $e$
such that neither $e$ nor $not(e)$ occur in $B$. This is called the choice literal or choice
point. The algorithm then calls itself recursively in order to find an answer set of $\Pi$
from the partial answer set $B \cup \{e\}$. If one such answer set is found, then the
algorithm returns it. If instead no answer set is found, then the algorithm attempts to
find an answer set of $\Pi$ that contains $B \cup \{not(e)\}$. If the attempt succeeds,
the answer set is returned. Otherwise, the algorithm returns no model ($\bot$).
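
The recursion of Figure 1 can be illustrated with a small Python sketch. It is not the SMODELS implementation: as a simplifying assumption made here, the program is a list of propositional clauses over integer literals (a negative integer stands for the complement), expand is approximated by unit propagation, and choose_literal simply picks the smallest unassigned variable. These representations and helper names are illustrative only.

```python
from typing import FrozenSet, List, Optional, Set

Clause = FrozenSet[int]  # assumed encoding: 3 means x3 holds, -3 means x3 does not


def expand(clauses: List[Clause], assigned: Set[int]) -> Optional[Set[int]]:
    """Stand-in for expand: unit propagation. Returns None on inconsistency."""
    result = set(assigned)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in result for lit in clause):
                continue  # clause already satisfied
            open_lits = [lit for lit in clause if -lit not in result]
            if not open_lits:
                return None  # every literal of the clause is false
            if len(open_lits) == 1:
                result.add(open_lits[0])  # forced consequence
                changed = True
    return result


def choose_literal(clauses: List[Clause], partial: Set[int]) -> Optional[int]:
    """Placeholder general-purpose heuristic: smallest unassigned variable."""
    for var in sorted({abs(lit) for clause in clauses for lit in clause}):
        if var not in partial and -var not in partial:
            return var
    return None  # nothing left to choose: the assignment is complete


def solve(clauses: List[Clause], assigned: Set[int]) -> Optional[Set[int]]:
    """Mirror of Figure 1: expand, test, choose, recurse on e and then on not(e)."""
    b = expand(clauses, assigned)
    if b is None:
        return None  # inconsistent: no model in this branch
    e = choose_literal(clauses, b)
    if e is None:
        return b  # complete and consistent: a model
    result = solve(clauses, b | {e})
    if result is None:
        result = solve(clauses, b | {-e})
    return result
```

In this simplified setting a complete, conflict-free assignment is always a model, so the "answer set" test and the "complete" test of Figure 1 collapse into the single check on choose_literal returning None.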

It is not difficult to see how the choices made by choose_literal greatly influence
the number of choice points picked by the algorithm, and ultimately its performance.
Consider for example the program:

p ← not q.    q ← not p.
r.
← p, r.
← q, not s.                                        (P_1)
u(X) ← t(X), not v(X).
v(X) ← t(X), not u(X).
t(0). t(1). ... t(1000).

The program is clearly inconsistent. In fact, the first two rules force either p or q to hold,
but the next three rules forbid p and q from holding. So, if the first call to choose_literal
were to select e.g. not p, then the following call to expand would conclude that q
must hold, and that inconsistency follows (since s is not defined by any rule and thus
the body of the corresponding constraint is satisfied). The algorithm would then
backtrack and select p. This time, expand would derive inconsistency from the fact that
the body of the first constraint is satisfied. Hence, the algorithm would return $\bot$ (no
model). However, consider what would happen if choose_literal were to select u(0)
instead of not p. Function expand would derive the consequence not v(0) and fail to
reach inconsistency. Then, the algorithm would recurse, and possibly select, say, u(1).
As before, expand would not detect any inconsistency, and allow the algorithm to
recurse again. Suppose now choose_literal were to pick not p. Following the same steps
outlined earlier, the algorithm would derive inconsistency. Upon backtracking, the
algorithm would also derive inconsistency from the selection of p. However, the finding
would only affect the current branch of the search stemming from the selection of u(1),
and the algorithm would then backtrack, select not u(1), and recurse. At this point, the
algorithm would be again free to select any of the remaining u(X) literals, which from
an intuitive point of view means going in the wrong direction. Even if the algorithm
were to select not p right away, it would still have to backtrack over the choice of u(0)
and explore the corresponding branch of the search tree that starts from not u(0) before
finally concluding that the program is inconsistent. The reader can imagine the effect
on the algorithm's performance if choose_literal were to choose not p at an even later
point in the search process.

In order to reduce the chances of choose_literal making "wrong" selections, modern
solvers base literal selection on carefully designed heuristics. For example, in
SMODELS the selection is roughly based on maximizing the number of consequences
that can be derived after selecting the given extended literal [9]. These techniques work
well in a number of cases, but not always. In fact, particular features of the program can
confuse the heuristics. When that happens at an early stage of the search process, the
effect is often disastrous, causing the solver to fail to return an answer in an acceptable
amount of time. Particularly frustrating is the fact that the efficiency of the heuristics
may change substantially in response to small changes to the input program.
For example, the choose_literal heuristics may make good selections for one problem
instance, while they may cause the search to take an unacceptable amount of time for a
not-too-different problem instance.

As we mentioned in the introduction, one way to limit the effect of wrong selections
by choose_literal is that of allowing the solver to learn about relevant conflicts at
run-time. Once learned, the information about conflicts can be used for the early pruning of
other branches of the search space (e.g. [5,4]). Although this technique has proven to
be extremely effective, it does not address directly the issue of choose_literal making
wrong choices, but rather curbs the problem by making some of those choices
impossible after learning has taken place, or by allowing the solver to backtrack quickly
after a wrong choice has been made. Furthermore, because the learning occurs at run-time,
during the initial phase of the computation in which learning has not yet occurred,
choose_literal may once again affect efficiency negatively by taking the search process in
the wrong direction. Finally, whatever has been learned in one execution of the algorithm is
discarded upon termination, and cannot be used in later runs.

In the next section, we describe a different approach, aimed at improving directly
the selections made by choose_literal and at retaining what the algorithm has learned.

4 The DORS Framework 

Our technique for learning domain-specific heuristics and using them for literal
selection applies to the situation in which one is interested in solving a number of problem
instances from a given problem domain. Such situations are very common in the ASP
community - see e.g. the Second Answer Set Programming Competition [11]. Moreover,
this is particularly the case in industrial applications, where the application
contains the domain description, and the user describes the instance using some interface
(refer e.g. to [12]), which then automatically encodes the problem instance.

Program $P_1$, shown earlier, can be viewed as consisting of a domain description
and a problem instance: the first 7 rules constitute the former, while the definition of
predicate t is the problem instance. A different problem instance might then define t
as {t(5), t(6), t(7)}. In this case, it is obvious that a good strategy for the selection
of the literals consists in first choosing among {p, not p, q, not q} and only later (if
necessary) considering the extended literals formed by u and v.



In general, the domain-specific heuristics for choose_literal will be learned - rather
than manually specified - by analyzing the choices made by the standard solver solve
when solving representative problem instances from the domain. This approach is par- 
ticularly useful in applications in which a number of problem instances from the same 
class of problems will have to be solved over time - for example, in the setting of an 
industrial application, or in a programming/solver competition in which benchmark- 
ing is involved - and computational power is available off-line to allow learning the 
domain-specific heuristics (e.g. before deploying the application, or before submitting 
the solver or solutions to a competition). 

Next, we discuss how choices made in previous runs of the algorithm can be
extracted and combined for future use. The final result will be the learning of a policy
(see e.g. [13] for a comprehensive introduction on the topic), that is, in general terms,
of a mapping from states to probabilities of selecting each available action. To achieve
this, the algorithm from Figure 1 is modified to maintain a record of the choice points
selected, and to return the list of such choice points together with the answer set, when
one is found. The modified algorithm is shown in Figure 2. In the algorithm, the list of



function solve_cp(Π : Program, A : Set of Extended Literals, S : Ordered List of Extended Literals)
    B := expand(Π, A);
    if (B is answer set of Π) then return ⟨B, S⟩;
    if (B is not consistent or B is complete) then return ⊥;
    e := choose_literal(Π, B);
    ⟨B', S'⟩ := solve_cp(Π, B ∪ {e}, S ∘ e);
    if (B' ≠ ⊥) then return ⟨B', S'⟩;
    ⟨B', S'⟩ := solve_cp(Π, B ∪ {not(e)}, S ∘ not(e));
    return ⟨B', S'⟩;



Figure 2. Search Algorithm for ASP with Explicit Tracking of Choice Points 



choice points is stored in variable $S$. Symbol $\circ$ represents concatenation. When solve_cp
is initially invoked, $S$ is the empty list.
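
The bookkeeping of Figure 2 can be sketched in Python as follows. As in any such illustration, several things are assumptions made here rather than part of the framework: the program is a list of propositional clauses over integer literals, expand is replaced by unit propagation, and the placeholder heuristic picks the smallest unassigned variable. The point of the sketch is only that the decision list is extended on the way down and that choices made in failed branches never reach the returned sequence.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

Clause = FrozenSet[int]  # assumed encoding: -3 is the complement of 3
Outcome = Optional[Tuple[Set[int], Tuple[int, ...]]]


def expand(clauses: List[Clause], assigned: Set[int]) -> Optional[Set[int]]:
    """Stand-in for expand: unit propagation. Returns None on inconsistency."""
    result = set(assigned)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in result for lit in clause):
                continue
            open_lits = [lit for lit in clause if -lit not in result]
            if not open_lits:
                return None
            if len(open_lits) == 1:
                result.add(open_lits[0])
                changed = True
    return result


def first_unassigned(clauses: List[Clause], partial: Set[int]) -> Optional[int]:
    """Placeholder for choose_literal."""
    for var in sorted({abs(lit) for clause in clauses for lit in clause}):
        if var not in partial and -var not in partial:
            return var
    return None


def solve_cp(clauses: List[Clause], assigned: Set[int],
             decisions: Tuple[int, ...] = ()) -> Outcome:
    """Return (model, decision sequence), or None when no model exists."""
    b = expand(clauses, assigned)
    if b is None:
        return None
    e = first_unassigned(clauses, b)
    if e is None:
        return b, decisions
    for choice in (e, -e):  # try e first, then not(e)
        outcome = solve_cp(clauses, b | {choice}, decisions + (choice,))
        if outcome is not None:
            return outcome
    return None
```

Because each recursive call receives its own extended copy of the decision tuple, backtracking implicitly discards the choice points of the abandoned branch.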

Now we turn our attention to combining the information collected by solve_cp into
domain-specific heuristics. Given the domain description $M$ and a problem instance $I$
that is to be used to learn the domain-specific heuristics, the decision-sequence of $I$
(denoted by $d(I)$) is $\bot$ if $\mathit{solve\_cp}(I \cup M, \emptyset, \emptyset) = \bot$
and $S$ if $\mathit{solve\_cp}(I \cup M, \emptyset, \emptyset) = \langle A, S \rangle$
for some $A$. From now on, given a decision-sequence $d$, we denote its $n$-th element by
$d_n$. Moreover, given an extended literal $e$, $\mathit{level}(e, d)$ denotes the index $i$
such that $d_i = e$ ($e$ is guaranteed not to occur at more than one position by construction
of the decision-sequence in solve_cp). Intuitively, $\mathit{level}(e, d)$ represents the level
in the decision tree at which $e$ was selected. Notice that, by construction of the sequence
of choice points in solve_cp, if $d(I) \neq \bot$, then $d(I)$ only enumerates the choice
points that led directly to the answer set. All the choice points that did not lead directly
to it, in the sense that they were later backtracked upon, are in fact discarded every time
the algorithm backtracks.

In order to improve the quality of the learned heuristics, we divide the class of
problem instances in subclasses, and associate with each problem instance $I$ an expression
$\sigma$ denoting the subclass it belongs to. The intuition is that using subclasses allows
the literal selection heuristics to be further tailored to the particular features of the
problem instances. For example, in a planning domain, $\sigma$ might be the maximum length of
the plan (often called lasttime or maxtime in ASP-based planning). The subclass of a problem
instance $I$ is denoted by $\sigma(I)$.

Let $\mathcal{I}$ denote the set of all problem instances that will be used for the learning of
the domain-specific heuristics. Next, we specify a way to determine how many times
an extended literal $e$ was selected at a certain level of the decision-sequences for the
problem instances in $\mathcal{I}$. More precisely, given a positive integer $\delta$, called the
scaling factor, and subclass $\sigma$, the occurrence count of an extended literal $e$ w.r.t. a
level $l$ and set of instances $\mathcal{I}$ is

$$o_{\delta,\sigma}(e, l, \mathcal{I}) = |\{ I \mid I \in \mathcal{I} \land \sigma(I) = \sigma \land d(I) \neq \bot \land l - \delta/2 \leq \mathit{level}(e, d(I)) < l + \delta/2 \}|.$$

The scaling factor $\delta$ allows taking into account all the occurrences of $e$ at a level in
the interval $[l - \delta/2, l + \delta/2)$. If $\delta = 1$, then only the occurrences of $e$
with level equal to $l$ are considered. Values of $\delta$ greater than 1 can be useful in those
cases in which all or most permutations of a subsequence of choice points lead to an answer set.
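
The occurrence count lends itself to a direct transcription. The following Python sketch is illustrative only and rests on a few assumptions made here: each learning instance is represented as a pair (subclass, decision-sequence), a decision-sequence of None plays the role of $\bot$, extended literals are plain strings, and levels are numbered from 1. An instance whose decision-sequence does not contain $e$ is taken not to contribute to the count.

```python
from typing import Hashable, Iterable, Optional, Sequence, Tuple

# Assumed representation: (subclass, decision-sequence or None for "no model").
Instance = Tuple[Hashable, Optional[Sequence[str]]]


def level(e: str, d: Sequence[str]) -> int:
    """1-based position of extended literal e in decision-sequence d."""
    return list(d).index(e) + 1


def occurrence_count(e: str, l: int, instances: Iterable[Instance],
                     sigma: Hashable, delta: int) -> int:
    """Number of instances of subclass sigma whose decision-sequence
    selects e at a level in the interval [l - delta/2, l + delta/2)."""
    count = 0
    for subclass, d in instances:
        if subclass != sigma or d is None or e not in d:
            continue
        if l - delta / 2 <= level(e, d) < l + delta / 2:
            count += 1
    return count
```

For instance, with the two decision-sequences ["not p", "u(0)"] and ["u(0)", "not p"] in the same subclass, the count of "not p" at level 1 is 1 for a scaling factor of 1, and 2 for a scaling factor of 3, since the wider window also covers level 2.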

Let now $E = \{e_1, e_2, \ldots, e_k\}$ be a set of extended literals, representing possible
choice points at some level $l$ of the decision tree. The set of best choice points among
$E$ is:

$$\mathit{best}_\delta(l, E, \sigma, \mathcal{I}) = \{ e \mid e \in E \land \forall e' \in E \;\; o_{\delta,\sigma}(e, l, \mathcal{I}) \geq o_{\delta,\sigma}(e', l, \mathcal{I}) \}.$$

Intuitively, $\mathit{best}_\delta(l, E, \sigma, \mathcal{I})$ returns the choice points that, if
taken at level $l$, are most likely to lead to an answer set without backtracking, based on the
information collected about the instances of subclass $\sigma$ in $\mathcal{I}$. Algorithms for
the computation of $\mathit{best}_\delta(l, E, \sigma, \mathcal{I})$ and
$o_{\delta,\sigma}(e, l, \mathcal{I})$ are simple and are omitted to save space.

Function $\mathit{best}_\delta(l, E, \sigma, \mathcal{I})$ encodes the essence of the
domain-specific heuristics, or, more precisely, the policy^1 for the selection of choice
points. Algorithm choose_literal can now be extended to perform literal selection guided by
the domain-specific heuristics. The modified algorithm, choose_literal_dspec, is shown in
Figure 3. In choose_literal_dspec, argument $T$ is the set of extended literals that have
previously been selected by choose_literal_dspec. If
$\mathit{best}_\delta(\mathit{level}, E', \sigma(I), \mathcal{I})$ is the empty set,
then choose_literal_dspec falls back to performing standard extended literal selection
via choose_literal. This covers the cases in which the learned heuristics do not prescribe
any extended literal for the current decision level, or in which all the extended literals
that the learned heuristics prescribed have already been tried. Modifying the standard
solver's algorithm in order to use the domain-specific heuristics for choice-point
selection is rather straightforward. A simple version, which for the most part follows the
well-known iterative version of the SMODELS algorithm, is shown in Figure 4.
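
The selection policy and its fallback can be sketched in Python as below. This is an illustration under stated assumptions, not the implementation used in the experiments: instances are (subclass, decision-sequence) pairs with None standing for $\bot$, extended literals are strings, levels start at 1, and the caller passes in the candidates that are still undecided in the partial answer set. One further assumption is made here so that the fallback can trigger: a literal whose occurrence count is zero is treated as "not prescribed", which the formal definition does not spell out.

```python
import random
from typing import Callable, Hashable, Iterable, List, Optional, Sequence, Set, Tuple

Instance = Tuple[Hashable, Optional[Sequence[str]]]


def occurrence_count(e: str, l: int, instances: Sequence[Instance],
                     sigma: Hashable, delta: int) -> int:
    """Instances of subclass sigma that select e within [l - delta/2, l + delta/2)."""
    return sum(
        1
        for subclass, d in instances
        if subclass == sigma and d is not None and e in d
        and l - delta / 2 <= list(d).index(e) + 1 < l + delta / 2
    )


def best(l: int, candidates: Iterable[str], sigma: Hashable,
         instances: Sequence[Instance], delta: int) -> Set[str]:
    """Candidates with maximal occurrence count (empty if all counts are zero)."""
    counts = {e: occurrence_count(e, l, instances, sigma, delta) for e in candidates}
    top = max(counts.values(), default=0)
    if top == 0:
        return set()  # assumption: nothing prescribed at this level
    return {e for e, c in counts.items() if c == top}


def choose_literal_dspec(candidates: Iterable[str], level: int, tried: Set[str],
                         sigma: Hashable, instances: Sequence[Instance], delta: int,
                         fallback: Callable[[List[str]], str]) -> str:
    """Prefer a learned choice point; otherwise defer to the standard heuristic."""
    open_candidates = [e for e in candidates if e not in tried]
    b = best(level, open_candidates, sigma, instances, delta)
    if b:
        return random.choice(sorted(b))  # uniform choice among the best
    return fallback(open_candidates)
```

The fallback argument stands in for the solver's general-purpose choose_literal; passing it as a function keeps the learned policy independent of whichever standard heuristic the underlying solver provides.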

Next, we describe how grounding is handled in the DORS framework. The
discussion is based on the architecture of the LPARSE+SMODELS system but can be
extended to other ASP systems as well. ASP solvers typically expect in input ground

^1 We assume uniform probability of selection among the elements of the set returned by
$\mathit{best}_\delta(l, E, \sigma, \mathcal{I})$.



function choose_literal_dspec(Π : Program, σ : Problem Subclass, A : Set of Extended Literals,
                              level : Integer /* Current Level in the Decision Tree */,
                              T : Set of Extended Literals, I : Set of Instances,
                              δ : Integer /* Scaling Factor */)
    L := lit(Σ(Π)); E := L ∪ {not l | l ∈ L};
    E' := ∅;
    for each e ∈ E
        if (e ∉ A ∧ not(e) ∉ A ∧ e ∉ T) then E' := E' ∪ {e};
    end for
    B := best_δ(level, E', σ, I);
    if (B ≠ ∅) then chosen := one_element_of(B);
    else chosen := choose_literal(Π, A);
    return chosen;



Figure 3. Function for Literal Selection with Domain-Specific Heuristics 



(i.e. variable-free) programs. However, because using variables in ASP programs is
convenient, programs are first pre-processed by a grounder (LPARSE and GRINGO in the
systems considered here), which replaces each non-ground rule by the set of its ground 
instances. The main difficulty in implementing our technique in state-of-the-art ASP 
systems is that their grounders often introduce "unnamed atoms" during the grounding 
process. An unnamed atom is an atom that does not occur in the original program, and 
is used internally by the ASP system. Because of their local use, unnamed atoms are 
assigned identifiers that are only valid for the current run of the system. There is no 
guarantee that unnamed atoms will be assigned the same identifiers when the system 
is run on a different problem instance. Because nothing prevents unnamed atoms from 
being used as choice points by the solver, one needs to ensure that unnamed atoms 
are given a unique, known identifier, so that choice-point information regarding them 
can be properly handled. One possible solution is to modify the ASP grounders so that 
unnamed atoms are given identifiers that remain valid across multiple executions. Al- 
though conceptually simple, this solution requires modifying each grounder that one is 
interested in using. In this paper we present instead a relatively simple, indirect method 
that consists of a pre-processing phase and a post-processing phase, and does not in- 
volve modifications to the grounders. 

In LPARSE and GRINGO, unnamed atoms are introduced during the grounding of
rules containing certain constraint literals, in order to simplify their structure^2. For
example, the choice rule in the program:

p(1). p(2). p(3).
1{a(X) : p(X)}2.

is translated by the grounder as: 

{a(1), a(2), a(3)}.
← μ1.
μ1 ← 3{not a(1), not a(2), not a(3)}.
← μ0.
μ0 ← 3{a(3), a(2), a(1)}.

^2 A thorough explanation of the process is beyond the scope of this paper. We refer the interested
reader to e.g. [8].



function solve_dspec(Π : Program, σ : Problem Subclass, I : Set of Instances, δ : Scaling Factor)
    var S : Stack of Sets of Extended Literals;
    var B, T : Set of Extended Literals;
    var terminate : Boolean;

    S := ∅; B := ∅; T := ∅;
    terminate := false;
    while (terminate = false)
        B := expand(Π, B);
        if (B is answer set of Π) then
            terminate := true;
        else
            if (B is not consistent or B is complete) then
                if (S = ∅) then
                    B := ⊥;
                    terminate := true;
                else
                    /* Backtrack */
                    B := top(S);
                    S := pop(S);
                end if
            else
                /* Select a choice point */
                e := choose_literal_dspec(Π, σ, B, level, T, I, δ);
                T := T ∪ {e};
                S := push(B ∪ {not(e)}, S);
                B := B ∪ {e};
            end if
        end if
    end while
    return B;



Figure 4. Search Algorithm for ASP with Domain-Specific Heuristics for Choice-Point 
Selection 



where $\mu_0$ and $\mu_1$ are unnamed atoms. As we mentioned above, no assumptions can be
made about which identifiers are used for the unnamed atoms. If we were for example
to add to the program a second choice rule^3, the grounding of the new program could
use some new identifiers $\mu_2$, $\mu_3$ for the above translation. On the other hand, because
of the structure of the grounding algorithm, the relative order of the rules belonging to 
the grounding of the choice rule is independent from the changes made to the rest of 
the program. Moreover, whenever multiple unnamed atoms occur in the body of a rule, 
their relative order is independent of changes made to the rest of the program. We will 
make use of these two properties later. 

In the pre-processing phase, the user specifies a name for each rule whose grounding 
may cause the introduction of unnamed atoms. Because we want to avoid modifications 
to the grounder, we cannot extend the syntax of rules to allow specifying a name
explicitly^4. Therefore, the name of the rule is instead specified in the body of the rule, using a



^3 Or if the number of ground instances of the choice rule of our example were to change because
of changes in the problem instance.
^4 However, a pre-processor can be used, as discussed later.



special relation $\nu$^5. So, the choice rule above can be written as:



1{a(X) : p(X)}2 ← ν(r1).                                        (1)



Generally speaking, given a list, $X$, of all the free variables in the rule, and some fresh
constant $\rho$, the name is specified by the atom $\nu(\rho, X)$. A rule whose name is specified
as above is called an augmented rule.

To ensure that the meaning of a rule is not altered by the augmentation, a definition
of atom $\nu(\cdot)$ must also be provided (otherwise the body of the augmented rule is never
satisfied). Because state-of-the-art grounders usually drop trivially-true atoms from the
body of the rules, we define the new atom by a choice rule with no bounds and suitable
domain predicates for the arguments of relation $\nu$, such as $\{\nu(r_1)\}$. (The choice rule
will be removed later, to avoid affecting the performance of the solver.) When
processing (1), the grounder produces:



Notice how the unnamed atoms co-occur with the $\nu(\cdot)$ atom in the body of some of
the rules. Because of the structure of the grounding algorithm, this is the case for the
grounding of any rule that introduces unnamed atoms. The reader should also notice
that the addition of $\nu(\cdot)$ atoms to the program can be easily automated. A user could
then specify a name for the rule using a more convenient syntax, and have a simple
pre-processor introduce the $\nu(\cdot)$ atoms in the program as shown above.

function postp(G : Ground Program)
    Assoc := ∅; G' := G;
    for each rule ρ ∈ G and unnamed atom μ in ρ
        if ρ contains an atom ν(X) for some X then
            i := smallest positive integer such that ∀μ' ⟨μ', ν(i, X)⟩ ∉ Assoc;
            Assoc := Assoc ∪ {⟨μ, ν(i, X)⟩};
        end if
    end for
    for each atom of the form ν(X)
        Remove from G' rule {ν(X)} ← Γ (for some Γ);
        Remove every occurrence of ν(X) from G';
    end for
    for each ⟨μ, ν(i, X)⟩ ∈ Assoc
        Replace all occurrences of μ in G' by ν(i, X);
    end for
    return G';

The post-processing phase is based on the algorithm shown in Figure 5. The algorithm
works as follows. First, the ground rules are scanned for co-occurrences of

^5 Notice that the specification of the name of the rule in the body is purely a technical device,
and is not intended to convey any semantic information.




{a(1), a(2), a(3)} ← ν(r1).
← μ1, ν(r1).
μ1 ← 3{not a(1), not a(2), not a(3)}.



Figure 5. Post-processing algorithm 



unnamed atoms and $\nu$ atoms. The goal is to use the information provided by the $\nu$
atoms to give a name to the unnamed atoms they co-occur with. The association of
names to unnamed atoms is stored in variable Assoc. Because multiple unnamed atoms
may be introduced by the grounding of a single rule, an extra integer argument is added
to relation $\nu$ when naming unnamed atoms. Values for that argument are assigned on
a first-come, first-served basis. Because, as we noted above, the relative order of
unnamed atoms in the ground rules does not change, we are guaranteed that the naming
of unnamed atoms will be consistent throughout multiple runs of the grounder with
different input programs (as long as the domain description remains the same). In the next
for loop, all $\nu$ atoms and their definitions are removed from the program. Finally, the
unnamed atoms are renamed according to the associations encoded by variable Assoc.
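
The renaming step can be illustrated with a short Python sketch. The representation is hypothetical and chosen here for brevity: a ground rule is a pair (head atoms, body atoms) of strings, naming atoms are written nu(...), and the caller supplies a predicate is_unnamed that recognizes grounder-generated atoms. As a simplification of Figure 5, each unnamed atom is named once, at its first co-occurrence with a naming atom.

```python
import re
from typing import Callable, Dict, List, Optional, Set, Tuple

Rule = Tuple[List[str], List[str]]  # assumed form: (head atoms, body atoms)
NU_ATOM = re.compile(r"^nu\((.*)\)$")


def nu_argument(atom: str) -> Optional[str]:
    """Arguments of a naming atom nu(...), or None for any other atom."""
    match = NU_ATOM.match(atom)
    return match.group(1) if match else None


def postp(rules: List[Rule], is_unnamed: Callable[[str], bool]) -> List[Rule]:
    """Give unnamed atoms run-independent names, then strip the naming atoms."""
    assoc: Dict[str, str] = {}
    taken: Set[str] = set()
    for head, body in rules:
        args = [nu_argument(a) for a in body if nu_argument(a) is not None]
        if not args:
            continue  # no naming atom in this rule
        for atom in head + body:
            if is_unnamed(atom) and atom not in assoc:
                i = 1  # first-come, first-served index
                while f"nu({i},{args[0]})" in taken:
                    i += 1
                assoc[atom] = f"nu({i},{args[0]})"
                taken.add(assoc[atom])
    cleaned: List[Rule] = []
    for head, body in rules:
        if head and all(nu_argument(a) is not None for a in head):
            continue  # drop the definition of the naming atom
        new_head = [assoc.get(a, a) for a in head]
        new_body = [assoc.get(a, a) for a in body if nu_argument(a) is None]
        cleaned.append((new_head, new_body))
    return cleaned
```

Since the assigned names depend only on the order in which unnamed atoms are met, two groundings that differ solely in the identifiers picked by the grounder are mapped to the same post-processed program, which is the property the learning phase relies on.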

5 Experimental Evaluation 

In this section we discuss an experimental evaluation of the DORS framework. To en- 
sure applicability to a wider variety of cases, we have tested our implementation on 
both abstract problems and on problems from industrial applications of ASP. Here we 
show the results of testing on the 15 puzzle problem and on the task of planning for the 
Reaction Control System of the Space Shuttle. 

The solver used in the experiments is SMODELS, which we modified to obtain
implementations of algorithms solve_cp and solve_dspec. It should be noted that we did
not use CLASP for our experiments. In fact, extending the DORS framework to CLASP 
is complicated by the fact that this solver is based on conflict-driven clause learning 
(CDCL) (e.g. lH) rather than DPLL. Although we believe that certain similarities be- 
tween DPLL and CDCL make it technically possible to extend the DORS framework 
to CDCL-based systems, work on implementing the DORS framework within CLASP 
is still in the early stages and results will be discussed in a later paper.

In the rest of the discussion, we refer to the implementation of solve_dspec within
SMODELS as DSPEC. The grounders used were GRINGO for the 15 puzzle (because the
original solution of the puzzle used some features specific to GRINGO's language) and
LPARSE for the Reaction Control System. It is important to note that this interchangeable
use of grounders is only possible because of the grounding technique we described
in the previous section.

The 15 puzzle problem was one of the benchmarks used for the Second ASP Programming
Competition [11]. The description of the puzzle, taken from the competition's web
site, is shown in Figure 6(a). The goal configuration used in the competition is
shown in Figure 6(b). For the domain description, we have used the program published
on the competition's web site, modified to provide names of select rules, as explained
earlier. Next, for every value of k ranging between 10 and 30, we have generated 100
random problem instances that can be solved with k moves or fewer. The subclass that a
problem instance belongs to is identified by the value of k (i.e. the maximum allowed
length of a plan for that instance, called maxtime in the original encoding). Next, we
ran all the instances in each subclass with a timeout value of 6000 sec. The instances



^ The collection is available from
http://www.cs.kuleuven.be/~dtai/events/ASP-competition/problem_instances.tar.gz.



"In 15-Puzzle, we have a 4 x 4 grid
where there are 15 numbers (1 to 15)
and one blank. The goal is to arrange
the numbers from their initial configuration
to the goal configuration by
swapping one number at a time with
its adjacent blank position. Let (x, y)
be the coordinates of a number on the
grid and (i, j) be those of the blank.

Then (x, y) and (i, j) are adjacent if
|x - i| + |y - j| = 1."

Figure 6. (a) Description of the 15 Puzzle; (b) Goal Configuration



that took less than a time threshold t_k were then used to learn the domain-specific
heuristics, while the remaining instances - called hard instances - were used for the
evaluation phase. In the evaluation phase, we have run the hard instances using the
learned domain-specific heuristics. The scaling factor S (discussed earlier) was set to
1. Figure 7 compares the performance of SMODELS and DSPEC for the subclass with
maxtime = 28, where we set the threshold t_k to 70 seconds (selected to have a sufficient
number of samples for learning). The domain-specific heuristics gave an average
speedup of 6.4 times over the standard solver, with a maximum speedup of more than
24. More importantly, all 11 instances on which the standard solver timed out
were solved within the time limit by DSPEC, substantiating our claim that the use of
domain-specific heuristics helps make the solver's performance more consistent. (Similar
performance was obtained on other subclasses. We omit the results to save space.)

Figure 7. Performance Comparison on the 15 Puzzle. Machine specs: Intel Q6600 CPU,
2.4GHz, 6GB RAM.
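
The adjacency rule from the puzzle description, and the requirement that each instance be solvable within k moves, can be illustrated with a short Python sketch. This is not the ASP encoding from the competition, and it is not necessarily how our instances were produced: the function names, the flat-tuple board representation (0 stands for the blank), and the random-walk generation scheme are all assumptions made for illustration. The goal layout of Figure 6(b) is not reproduced here, so any goal passed in is a placeholder.

```python
import random

SIZE = 4  # the 15 puzzle is played on a 4 x 4 grid


def adjacent(x, y, i, j):
    """True if cells (x, y) and (i, j) are adjacent: |x - i| + |y - j| = 1."""
    return abs(x - i) + abs(y - j) == 1


def moves(board):
    """Yield every board obtained by swapping the blank (0) with an adjacent number."""
    blank = board.index(0)
    i, j = divmod(blank, SIZE)
    for pos in range(SIZE * SIZE):
        x, y = divmod(pos, SIZE)
        if adjacent(x, y, i, j):
            nxt = list(board)
            nxt[blank], nxt[pos] = nxt[pos], nxt[blank]
            yield tuple(nxt)


def random_instance(goal, k, rng):
    """Walk between 1 and k random moves away from goal (assumed scheme).

    Reversing the walk gives a plan of at most k moves, so the returned
    board is solvable in k moves or fewer.
    """
    board = tuple(goal)
    for _ in range(rng.randint(1, k)):
        board = rng.choice(list(moves(board)))
    return board
```

A random walk may revisit earlier boards, so the shortest plan for a generated instance can be much shorter than k; the sketch only guarantees the upper bound.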



' This is just the lower bound of the estimate, since SMODELS timed out several times. 
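
The evaluation protocol can be summarized in a few lines of Python. This is a minimal sketch under stated assumptions, not the scripts used for the experiments: the names split_instances and speedup are hypothetical, and a timed-out run is assumed to be recorded with the timeout value itself. It shows why a speedup computed against a timed-out baseline run is only a lower bound.

```python
TIMEOUT = 6000.0  # seconds, the timeout used in the experiments


def split_instances(baseline_times, threshold):
    """Split instance names into (training, hard) by baseline run time.

    baseline_times maps an instance name to its run time in seconds with
    the standard solver; a timed-out run is recorded as TIMEOUT (assumption).
    """
    training = [n for n, t in baseline_times.items() if t < threshold]
    hard = [n for n, t in baseline_times.items() if t >= threshold]
    return training, hard


def speedup(baseline, tuned):
    """Return (ratio, is_lower_bound) for one instance.

    If the baseline run timed out, its true run time is at least TIMEOUT,
    so the ratio can only underestimate the real speedup.
    """
    return baseline / tuned, baseline >= TIMEOUT
```

Averaging the ratios over the hard instances then gives a figure that is itself a lower bound whenever at least one baseline run timed out.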



The second problem domain for which we report experimental results is that of
planning for the Reaction Control System (RCS) of the Space Shuttle. As described
in e.g. [14, 12], the RCS is the Shuttle's system that has primary responsibility for
maneuvering the Shuttle while it is in space. It consists of fuel and oxidizer tanks, valves,
and other plumbing needed to provide propellant to the maneuvering jets of the Shuttle.
The RCS also includes electronic circuitry, both to control the valves in the fuel lines
and to prepare the jets to receive firing commands. In order to configure the Shuttle for
an orbital maneuver, the RCS must be configured by opening and closing appropriate
valves. This is accomplished by either changing the position of the associated switches,
or by issuing computer commands. In normal conditions, the procedures for the
configuration of the RCS for a given maneuver are known in advance by the astronauts.
However, if components of the RCS are faulty, then the standard procedures may not
be applicable. Moreover, because of the number of possible combinations of faults, it is
impossible to prepare in advance a set of configuration procedures for faulty situations.
In those cases, ground control needs to carefully examine the problem and manually
come up with a configuration procedure. The system described in [14, 12] uses a model
of the RCS, as well as ASP-based reasoning algorithms, to provide ground control with
a decision-support system that automatically generates configuration procedures for the
RCS and that can be used when faulty components are present (incidentally, the system
can also perform diagnostic reasoning lITSI ). 

A collection of problem instances from the domain of the RCS is publicly available,
together with the ASP encoding of the model of the RCS. The interested reader may
refer to [14] for a description of the instances. For our testing, we have selected a set
of 425 instances from the collection, corresponding to the public instances with no
electrical faults and 3, 8, and 10 mechanical faults respectively, for which a plan of
length 6 or less (determined by parameter lasttime) was found in the experiments
discussed in [14, 12], and we have analyzed the performance of the solver on planning
with maximum lengths ranging between 6 and 10.

As before, first we ran all the instances with the standard solver and a timeout of
6000 sec. Of those, the instances that took less than 50 sec were used to learn the
domain-specific heuristics, while the remaining "hard instances" were used for the
evaluation phase. The problem subclasses were defined by the pair (lasttime, maneuver),
where lasttime specifies the maximum plan length and maneuver is the maneuver
that the RCS must be configured for (in our experiments, using the maneuver in the
subclass definition substantially improved the performance of the learned heuristics).
Figure 8 shows the results of the comparison for the 91 hard instances with 8 and 10
mechanical faults and values of lasttime of 9 and 10. We believe the speedup obtained
with the domain-specific heuristics is remarkable. First of all, out of 53 instances for
which the standard solver timed out before finding a solution, in 48 cases the
domain-specific heuristics made it possible to find a solution within the time limit, and
in some cases in under 10 seconds. The average speedup is 259.2, with a peak of 1253.1 for an in-



* The files are available from http://www.krlab.cs.ttu.edu/Software/Download/.

Figure 8. Performance Comparison on the RCS Domain. Machine specs: Intel i7 CPU,
2.93GHz, 8GB RAM.



stance for which SMODELS timed out and a peak of 544.5 for an instance for which
SMODELS did not time out. In 6 cases (out of 91) DSPEC performed worse than the
standard solver. We believe that these outliers can be eliminated if more samples are
made available for learning.
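
The subclass definition and the counts reported above (baseline timeouts, timeouts resolved by DSPEC, and instances on which DSPEC was slower) can be sketched as follows. The record layouts and the names by_subclass and summarize are assumptions introduced for illustration only; they do not describe the actual tooling used in the experiments.

```python
from collections import defaultdict

TIMEOUT = 6000.0  # seconds, the timeout used in the experiments


def by_subclass(instances):
    """Group instance names by their (lasttime, maneuver) subclass.

    instances is an iterable of (name, lasttime, maneuver) triples
    (a hypothetical record layout).
    """
    groups = defaultdict(list)
    for name, lasttime, maneuver in instances:
        groups[(lasttime, maneuver)].append(name)
    return dict(groups)


def summarize(results):
    """Summarize (baseline_seconds, dspec_seconds) pairs for hard instances.

    Counts the baseline timeouts, how many of those the tuned solver
    finished within the limit, and how many instances became slower.
    """
    timed_out = [(b, d) for b, d in results if b >= TIMEOUT]
    rescued = sum(1 for _, d in timed_out if d < TIMEOUT)
    slower = sum(1 for b, d in results if d > b)
    return {"timeouts": len(timed_out), "rescued": rescued, "slower": slower}
```

Keying the learned heuristics on the full pair rather than on lasttime alone reflects the observation that including the maneuver in the subclass definition improved the learned heuristics.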



6 Conclusions 

In this paper we have described a framework that allows learning and using domain- 
specific heuristics for choice-point selection, and we have demonstrated its application 



' The actual speedup could in fact be higher, since SMODELS timed out. As a test, we have let 
SMODELS run on some of these instances for over 60,000 seconds (16 hours) without getting
a solution. 



to ASP. Our experimental evaluation has shown that domain-specific heuristics can give
remarkable speedups, and make it possible to find answers that cannot otherwise be
computed in a reasonable amount of time. In the case of the RCS domain, a large number
of the instances for which the standard solver timed out could be solved in a matter of
seconds using the domain-specific heuristics, with an average speedup of more than 2
orders of magnitude and peaks of more than 3. This is the type of consistent performance
that makes a solver viable for industrial applications. We believe that an appealing feature
of the DORS framework is that in principle it can be applied to any solver based on the
DPLL procedure. Hence, it is possible to extend the approach shown here to other ASP
solvers, or even to e.g. constraint solvers. Work is also ongoing on extending the DORS
framework to solvers based on conflict-driven clause learning, such as CLASP. As a final
note, we would like to point out that the method used here to learn the domain-specific
heuristics is a very simple instance of policy learning. It will be interesting to investigate
how more sophisticated techniques from reinforcement learning, but also from machine
learning and data mining, can be applied within the DORS framework. We expect that
doing so will improve the performance of the solvers even further.